An MDL Estimate of the Significance of Rules
Abstract
This paper proposes a new method for measuring the performance of models (whether decision trees or sets of rules) inferred by machine learning methods. Inspired by the minimum description length (MDL) philosophy and theoretically rooted in information theory, the new method measures the complexity of test data with respect to the model. It has been evaluated on rule sets produced by several different machine learning schemes on a large number of standard data sets. In restricted cases it is shown to agree with the usual percentage correct measure. However, in more general cases taken from real data sets, for example when rule sets make multiple or no predictions, it disagrees substantially. It is argued that the MDL measure is more reasonable in these cases and represents a better way of assessing the significance of a rule set’s performance. The question of the complexity of the rule set itself is not addressed in the paper.
Publication date: 1996